FOREWORD

In order that more meaningful data can be obtained in certain
proposed sensitivity tests associated with nuclear weapon
vulnerability studies, a study of old methods in comparison with new
ones was deemed desirable.

The results presented here will aid an experimenter in
determining the feasibility of using stochastic approximation
techniques. Such techniques have wide application in industry
and their use is not confined to the evaluation of weapon systems.

Work  on  this  report  was  done  under  the  tasks  assigned  by  the 
Bureau  of  Naval  Weapons  Instruction  5450.17. 

The report was reviewed for technical accuracy by Charles E.
Antle, Statistics Laboratory, Oklahoma State University, and
Dr. Vanamamalai Seshadri of Southern Methodist University.


EDWARD BAKLINI
Head,  Applied  Science  Group 
ABSTRACT

The rates of convergence of three stochastic approximation
estimators are studied empirically using a Monte Carlo sampling
procedure. The results are presented in tabular form and various
conclusions are made as to the utility of each estimator in the
light of these results.
INTRODUCTION

Sensitivity testing deals with a continuous variable which
cannot be determined in practice. For example, suppose it is
desirable to know the amount of mass of a high explosive such
that the probability that an explosive response will occur when
the mass is subjected to a jet-fuel fire is less than some
specified level, say α. There are levels of mass at which less than
100α percent will respond and levels where more than 100α percent
will respond. Clearly, the critical value of mass at which
exactly 100α percent will respond cannot be measured. All one
can do is select a sample arbitrarily and determine whether the
critical value for a sample is less than or greater than the
mass of each element of the sample.

This situation arises in many fields of research. In selecting
insecticides, a critical dose is associated with each insect
but cannot be measured. One can only try some dose and observe
whether or not the preassigned percentage of insects are killed,
i.e., observe whether or not the desired dose for the insect is
less than the chosen dose. The same difficulty arises in
pharmaceutical research dealing with germicides, anaesthetics, and
other drugs, in testing strengths of materials, and in several
areas of engineering and developmental research.

In true sensitivity experiments, it is not possible to make
more than one observation on a given specimen. Once a test has
been made, the specimen is altered (e.g., the explosive is
destroyed, the insect weakened) so that a bona fide result cannot
be obtained from a second test on the same specimen. The common
procedure in experiments of this kind is to divide the sample of
specimens into several groups (usually, but not necessarily, of
the same size) and to test one group at a chosen level, a second
group at a second level, etc. The data consist of the numbers
affected and not affected at each level. Several methods of
analyzing such data (variously called sensitivity data, all-or-none
data, or quantal responses) are available (Ref. 1 and 2).

Most  of  the  methods  commonly  used  are  applicable  only  in 
special  cases,  most  of  which  are  based  on  various  assumptions 
concerning  the  distributions  of  the  estimators,  especially  if 
confidence  limits  are  desired.  A  method,  devised  relatively 
recently  (and  seldom  used  for  various  reasons),  is  available  to 
the  experimenter  in  which  he  may  estimate  any  critical  value  in 
its  range  with  some  assurance  that  after  a  large  number  of  trials 
the  estimator  will  approximate  closely  the  desired  critical  value. 
The method, called a stochastic approximation method, was
formulated by Robbins and Monro and published in 1951 in the Annals
of Mathematical Statistics (Ref. 3).

Briefly stated, stochastic approximation is concerned with
the regression of a variable y on a variable x, and seeks the
value x = θ for which the regression value of y is some
preassigned number, y = α. The estimation procedure for θ is
sequential and distribution-free. Despite its extreme simplicity in
application and the wide variety of the situations in which it
may be useful, the technique has not been taken advantage of by
empirical research workers. One reason for this may be that the
existing literature is addressed primarily to the professional
mathematician. Another reason may be that the mathematical
theory itself is not yet complete for relatively small samples.
A desirable feature of stochastic approximation is the lack
of assumptions required. In many problems, the researcher has
no clear picture of the structure of the relationship he wishes
to study and would prefer, if possible, not to commit himself to
hypothesize the precise shapes of the regression or other
distribution features. In such cases, he needs a procedure which is
distribution-free.

Theoretically,  the  problem  reduces  to  solving  the  regression 
equation 

(1)  M(x) = \alpha

This problem has been studied by Robbins and Monro (Ref. 3),
Blum (Ref. 4), Keston (Ref. 5), and others (Ref. 6, 7, and 8).
Using the notation of Robbins and Monro, M(x) denotes the
expected value at level x of the response, say Y, of a certain
experiment. M(x) is assumed to be a continuous monotone
function of x, but is unknown to the experimenter, and it is desired
to find the solution x = θ of the equation M(x) = α, where α is
a given constant. The Robbins and Monro method is one in which
successive experiments are performed at levels x_1, x_2, ... in
such a way that x_n will tend to θ in probability.

Except for an unpublished study by Teichrow¹ and an application
of the Robbins and Monro technique described by Louis and
Ruth Guttman (Ref. 9), little is available to the experimenter
to guide him in the use of stochastic approximation methods.
The purpose of this report is to give the experimenter information
which will aid him in determining the feasibility of using
stochastic approximation methods and, if he decides to use the
techniques, in determining which of the three available estimators
he should use. The proofs that two of the three estimators
converge with probability one to the desired value are available in
statistical literature and will not be discussed here.

¹ Teichrow, D., "An Empirical Investigation of the Stochastic
Approximation Method of Robbins and Monro."

The report is divided into two parts. The first is a
discussion and description of the estimators. The second part is an
empirical comparison of the convergence properties of the three
estimators.

Since the form of M(x) is not known to the experimenter, the
means used here to study the convergence properties is to employ
a Monte Carlo sampling scheme to simulate a test in which
stochastic approximation methods will be used. Upon repeated simulations
of trials for various forms of M(x), various convergence
properties of each of the three estimators can be observed.

The primary interest here lies in sensitivity testing,
sometimes called quantal response testing; therefore, the empirical
study made is a simulation of this type of testing. A similar
study could be made by assigning a continuous distribution
function to the observed random variable Y(x).


THREE  STOCHASTIC  APPROXIMATION  ESTIMATORS 

For each real number x, let Y(x) be a random variable such
that E[Y(x)] = M(x) exists. Assume that the regression equation
M(x) = α has a single root at x = θ, which is to be estimated,
and that (x - θ)[M(x) - α] > 0 for all x ≠ θ. An initial value
x_1 and a sequence {c_j} of positive numbers are selected. The
(j + 1)st approximation to θ is defined inductively by the
recursive formula

(2)  x_{j+1} = x_j + c_j(\alpha - y_j)

where y_j is the observed value of the random variable at
x = x_j. The letter j denotes the trial number.
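The recursion in Eq. 2 can be sketched in a few lines. This is a minimal illustration, assuming the step sequence c_j = c/j; the function name `robbins_monro`, the callback `observe`, and the noisy linear response in the usage lines are hypothetical and are not taken from the report.

```python
import random


def robbins_monro(observe, alpha, x1, steps, c=1.0):
    """Iterate x_{j+1} = x_j + c_j * (alpha - y_j) with c_j = c / j.

    observe(x) is a hypothetical callback returning one noisy
    observation y of the response at level x.
    """
    x = x1
    path = [x]
    for j in range(1, steps + 1):
        y = observe(x)                   # y_j observed at level x_j
        x = x + (c / j) * (alpha - y)    # Eq. 2
        path.append(x)
    return path


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed response for illustration: M(x) = x plus zero-mean noise,
    # so the root of M(x) = alpha is theta = alpha.
    noisy_line = lambda x: x + rng.gauss(0.0, 0.1)
    print(robbins_monro(noisy_line, alpha=0.3, x1=0.9, steps=200)[-1])
```

The returned list holds x_1 through x_{steps+1}, so the last element is the estimate after the final trial.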


Each of the three estimators can be written in the form of
Eq. 2. However, the difference lies in the way the sequence
{c_j} is defined.

The sequence {c_j} which defines estimator I (the Robbins-Monro
estimator) is a fixed sequence of positive elements with
the following properties:

(a)  \sum_{j=1}^{\infty} c_j = \infty

(b)  \sum_{j=1}^{\infty} c_j^2 < \infty

The sequence {1/j} has these properties.
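A brief check of the claim for the sequence {1/j}, written out with the standard results for the harmonic and Basel series (these are textbook facts, not taken from the report):

```latex
\sum_{j=1}^{\infty} \frac{1}{j} = \infty
\quad\text{(harmonic series)},
\qquad
\sum_{j=1}^{\infty} \frac{1}{j^{2}} = \frac{\pi^{2}}{6} < \infty .
```

By contrast, a constant sequence satisfies (a) but not (b), and the sequence {1/j^2} satisfies (b) but not (a).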


The second estimator (estimator II, proposed by Keston) is
defined by the recursive formula, Eq. 2, where the sequence
{c_j} is defined in the following way from the sequence {a_k}:

c_1 = a_1,   c_2 = a_2,   c_j = a_{t(j)}

where

t(j) = 2 + \sum_{i=3}^{j} \delta[(x_i - x_{i-1})(x_{i-1} - x_{i-2})]

and

\delta(x) = 1   if x \le 0
          = 0   if x > 0

Thus, every time (x_j - x_{j-1}) differs in sign from
(x_{j-1} - x_{j-2}), another a_k is taken. A further restriction on
the sequence {a_k}, other than the properties (a) and (b), is

(c)  a_{k+1} \le a_k

It is important to note that the elements of {c_j} for
j > 2 are random variables.
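The selection rule for estimator II can be sketched as follows. This is an illustration under the reading that t(j) counts the sign reversals of successive differences; the function name `kesten_steps` and the choice a_k = 1/k in the usage lines are assumptions, not part of the report.

```python
def kesten_steps(xs, a):
    """Return [c_1, ..., c_n] for the levels xs = [x_1, ..., x_n].

    a(k) is a hypothetical callable giving a_k. The rule is
    c_1 = a_1, c_2 = a_2, c_j = a_{t(j)}, where t(j) is 2 plus the
    number of i in 3..j with (x_i - x_{i-1})(x_{i-1} - x_{i-2}) <= 0.
    """
    cs = []
    t = 2
    for j in range(1, len(xs) + 1):
        if j <= 2:
            cs.append(a(j))
            continue
        d1 = xs[j - 1] - xs[j - 2]   # x_j - x_{j-1}
        d2 = xs[j - 2] - xs[j - 3]   # x_{j-1} - x_{j-2}
        if d1 * d2 <= 0:             # direction reversed: take the next a_k
            t += 1
        cs.append(a(t))
    return cs


if __name__ == "__main__":
    levels = [0.5, 0.6, 0.7, 0.65, 0.6, 0.62]
    print(kesten_steps(levels, lambda k: 1.0 / k))
```

The step length shrinks only when the sequence of levels reverses direction, so long runs in one direction keep the current step size.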

Keston's rule for selecting the members of {c_j} is based on
the conjecture that in the neighborhood of x = θ, θ being the
solution of Eq. 1, frequent fluctuations in the sign of
(x_j - θ) - (x_{j+1} - θ) = x_j - x_{j+1} indicate that
|x_j - θ| is small, whereas few fluctuations in the sign of
x_j - x_{j+1} indicate that x_j is far away from θ.

It can be shown that there exists a θ', not necessarily
identical with θ, where fluctuations in the sign occur more
frequently in a finite number of trials. The value x = θ' is
defined by the intersection of the line Y(x) = α and the locus
of the medians of the densities dH(y | x)/dy for any x. It
should be noted that if the density dH(y | x)/dy is symmetric,
then Keston's conjecture is obviously correct. Even though the
fluctuation would be expected to occur at θ' instead of θ, this
does not affect the convergence in probability of

(3)  x_{j+1} = x_j + c_j(\alpha - y_j)

to θ, as Keston has proved.
Let x_j be the value such that the variation in the algebraic
sign of x_j - x_{j+1} is maximum. Suppose that x_{j-1} < x_j. In order
for a variation in the sign to occur, x_{j+1} < x_j, where x_{j+1} is
defined by Eq. 3.

Let U denote a random variable whose density is the point
binomial. The variable U takes on the value unity with the
probability P_x, where

(4)  P_x = \Pr[x_{j+1} < x_j \mid x_{j-1} < x_j]

From Eq. 3, it follows that

(5)  P_x = \Pr[Y(x_j) > \alpha]

Clearly, U has maximum variance at P_x = 1/2. Therefore,
that value of x such that

(6)  \Pr[Y(x_j) > \alpha] = 1/2

is the desired value of θ'.

If x_{j-1} > x_j, a similar argument leads to the conclusion
that the value of x such that

(7)  \Pr[Y(x_j) < \alpha] = 1/2

is the desired θ'. Hence, θ' is the value of x defined by the
intersection of the line M(x) = α and the locus of the medians
of dH(y | x)/dy.


Since the sequence {x_j} converges to θ with probability one,
there exists a J such that for all j > J

\Pr[\sup_{j > J} |x_j - \theta| < |\theta - \theta'|] = 1 - \epsilon,   \theta' \ne \theta and \epsilon > 0

That is, there exists a neighborhood of θ which does not
contain θ' such that after some trial number N almost surely
all x_j will lie inside the neighborhood. Hence, there will
exist almost surely only a finite number of sign changes in a
neighborhood of θ' if θ is not in the neighborhood of θ'. But,
for a finite number of trials, the experimenter cannot be assured
whether the sign changes are occurring in the neighborhood of θ
or θ'.

In order to obtain an indication of how this fact would
affect the sequence {c_j}, consider the difference between the
medians and means of two rather common skewed densities: the
triangular and the gamma.


Consider first the following form of the triangular
distribution:

f(x) = \frac{2x}{cb},               0 \le x \le b
     = \frac{2(c - x)}{c(c - b)},   b \le x \le c

Table 1 presents values of the ratio of the median to c,
the ratio of the mean to c, and their difference for various
values of b/c. Note that for small values of c, the difference
between the median and the mean can be slight.


Table 2 presents the ratio of the median to β, the ratio of
the mean to β, and their difference for various values of α,
when the gamma density is of the following form:

f(x) = \frac{x^{\alpha} e^{-x/\beta}}{\beta^{\alpha+1} \Gamma(\alpha + 1)},   x > 0
TABLE 1. Comparison of the Mean and Median for
the Triangular Density Function

b/c     Median/c    Mean/c    Difference/c
 .5       .500       .500        .000
 .6       .548       .533        .015
 .7       .592       .567        .025
 .8       .632       .600        .032
 .9       .671       .633        .038
1.0       .707       .667        .040

TABLE 2. Comparison of Mean and Median for
the Gamma Density Function

 α      Median/β    Mean/β    Difference/β
 0         .693      1.000        .307
 1        1.678      2.000        .322
 2        2.674      3.000        .326
 3        3.672      4.000        .328
 4        4.671      5.000        .329
 5        5.670      6.000        .330
 6        6.670      7.000        .330
 7        7.669      8.000        .331
 8        8.669      9.000        .331
 9        9.669     10.000        .331
10       10.669     11.000        .331
From the data in Table 2, it appears that even for small
values of β, the difference between the median and the mean can
be relatively large.
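The entries of Tables 1 and 2 can be checked numerically. The sketch below assumes a triangular density on (0, c) with mode b and b/c of at least 1/2 (so the median lies on the rising branch), and a gamma density with integer α and β = 1; the function names are illustrative and not from the report.

```python
import math


def triangular_ratios(r):
    """Return (median/c, mean/c) for mode b = r*c, assuming r >= 1/2."""
    median = math.sqrt(r / 2.0)   # solves x^2 / (c*b) = 1/2 on 0 <= x <= b
    mean = (1.0 + r) / 3.0        # (0 + b + c) / 3
    return median, mean


def gamma_cdf(x, n):
    """CDF at x of a gamma variable with integer shape n and unit scale."""
    term, total = 1.0, 1.0
    for i in range(1, n):
        term *= x / i
        total += term
    return 1.0 - math.exp(-x) * total


def gamma_median(n, tol=1e-10):
    """Median by bisection; shape n corresponds to alpha + 1, beta = 1."""
    lo, hi = 0.0, float(n)        # the median lies below the mean n
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, n) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for r in (0.5, 0.6, 0.7, 0.8, 0.9, 1.0):
        med, mean = triangular_ratios(r)
        print(f"b/c={r:.1f}  median/c={med:.3f}  mean/c={mean:.3f}")
    for alpha in range(0, 11):
        med = gamma_median(alpha + 1)
        print(f"alpha={alpha:2d}  median={med:.3f}  mean={alpha + 1:.3f}")
```

Bisection is used for the gamma median only to keep the sketch within the standard library; any root finder would serve.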

It  should  be  noted  that  the  mean  and  the  median  are  identical 
in  the  binomial  distribution  if,  and  only  if,  p  =  1,  0,  or  1/2 
where  p  +  q  =  1.  The  importance  of  the  binomial  distribution  is 
that  it  is  the  basic  distribution  for  quantal  response  problems. 
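As a worked illustration of why this matters for quantal responses, assume a single observation per level (sample size one) and 0 < α < 1. Eq. 6 can then be evaluated directly, using only the point binomial and the symbols already defined:

```latex
\Pr[Y(x) > \alpha] = \Pr[Y(x) = 1] = M(x)
\quad\Longrightarrow\quad
M(\theta') = \tfrac{1}{2},
\qquad\text{while}\qquad
M(\theta) = \alpha .
```

Under these assumptions θ' coincides with θ only when α = 1/2.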

It is hard to justify the use of an estimator computed from
a small number of trials simply because it is known to converge
to the desired value as the number of trials increases without
bound. The fact that no other estimators have been proposed
and found better in some sense could be a just reason for using
the stochastic approximation estimator. Therefore, it seems
desirable to compare the two stochastic approximation estimators
previously described with an estimator (estimator III) which
seems to be the one which would be most naturally proposed by an
experimenter who had no knowledge of the Robbins-Monro or the
Keston estimators.

An experimenter who wishes to determine an x such that
M(x) = α would most logically select an x_1 which he would
consider as being close to the desired value and then compare the
random variable Y(x_1) with α.

If Y(x_1) exceeded α, then x_2 < x_1 would be selected
according to the magnitude of α - Y(x_1). Similarly, if Y(x_1) was less
than α, x_2 > x_1 would be selected. Clearly, if Y(x_1) = α, the
experimenter would continue testing at x_1. If after j tests
Y(x_{j-1}) < α and Y(x_j) > α, or Y(x_{j-1}) > α and Y(x_j) < α, then it
seems logical that the experimenter would interpolate in order
to obtain x_{j+1}. Also, it seems a desirable procedure to shorten
the steps that one takes after each trial in a small neighborhood
of the desired value of x. A modification of Keston's procedure
for shortening the step length seems intuitively adequate.


Mathematically, this procedure can be described by the
recursive formula, Eq. 2, where c_j is an element of a sequence
{c_j} defined by the following rule:

c_1 = a_1
c_2 = a_2

If c_{j-1} = a_k for k \ge 2, then

c_j = a_k                                   when \alpha \notin (y_j, y_{j-1})
    = (x_j - x_{j-1})/(y_j - y_{j-1})       when \alpha \in (y_j, y_{j-1})

c_{j+1} = a_k                               when c_j = a_k and \alpha \notin (y_{j+1}, y_j)
        = (x_{j+1} - x_j)/(y_{j+1} - y_j)   when \alpha \in (y_{j+1}, y_j)
        = a_{k+1}                           when \alpha \notin (y_{j+1}, y_j)
                                            and c_j = (x_j - x_{j-1})/(y_j - y_{j-1})

where a_k is an element of a sequence {a_k} having the following
properties:

(a)  a_k > 0   for k = 1, 2, ...

(b)  a_k \ge a_{k+1}   for k = 1, 2, ...

(c)  \sum_{k=1}^{\infty} a_k = \infty

(d)  \sum_{k=1}^{\infty} a_k^2 < \infty
That is, if α ∈ (y_j, y_{j-1}), then x_{j+1} is obtained by linear
interpolation. A new a_k is selected after each period of linear
interpolation. An end of a period occurs if α ∈ (y_j, y_{j-1}) but
α ∉ (y_{j+1}, y_j); hence, c_{j+1} is the next unused element of the
sequence {a_k}.
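One possible reading of the estimator III rule is sketched below. It assumes that interpolation is used whenever α lies strictly between the two most recent observations, that the same a_k is reused otherwise, and that the index k advances once each time a period of interpolation ends. The function name `estimator_iii`, the callback `observe`, and the sequence a_k = 3/k in the usage lines are hypothetical.

```python
def estimator_iii(observe, alpha, x1, steps, a):
    """Iterate x_{j+1} = x_j + c_j * (alpha - y_j) with the interpolation rule.

    observe(x) returns one observation at level x; a(k) returns a_k.
    Both are hypothetical callables supplied by the caller.
    """
    x, x_prev, y_prev = x1, None, None
    k = 1                     # index of the a_k most recently used
    interpolating = False     # True if the previous step used the secant slope
    path = [x]
    for j in range(1, steps + 1):
        y = observe(x)
        if j <= 2:
            k = j
            c = a(k)                              # c_1 = a_1, c_2 = a_2
        elif min(y, y_prev) < alpha < max(y, y_prev):
            c = (x - x_prev) / (y - y_prev)       # linear interpolation
            interpolating = True
        else:
            if interpolating:                     # a period of interpolation ended
                k += 1
                interpolating = False
            c = a(k)
        x_prev, y_prev = x, y
        x = x + c * (alpha - y)
        path.append(x)
    return path


if __name__ == "__main__":
    # Noise-free linear response, for illustration only: theta = alpha = 0.3.
    print(estimator_iii(lambda x: x, 0.3, 0.9, 6, lambda k: 3.0 / k))
```

With a noise-free linear response the first interpolation step lands on the root, which makes the behavior of the rule easy to follow by hand.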


APPLICATION OF STOCHASTIC APPROXIMATION METHODS
TO QUANTAL RESPONSE PROBLEMS


Let the random variable Y take on only two values, unity
with the probability M(x) and zero with the probability 1 - M(x).
This type of response has been called quantal response. Let
there be two real numbers, a and b (a < b), such that

     Y(x) = 0   for all x < a

and  Y(x) = 1   for all x > b

Assume that a = 0 and b = 1. Then the regression function
M(x) will have the following properties:

M(x) = 0      for x < 0
     = f(x)   for 0 \le x \le 1
     = 1      for x > 1

In a neighborhood of x = θ, the root of the regression equation
M(x) = α, we know that there exists a small neighborhood of θ
in which

(8)  \Pr[|x_{j+1} - \theta| \ge |x_j - \theta| and (x_{j+1} - \theta)(x_j - \theta) > 0]

the probability of making an incorrect decision at x_j, is an
increasing function of x as x tends toward θ.

Since Y(x_j) can take on only the values of zero and unity,
and assuming α is neither 0 nor 1, then Pr[Y(x_j) = α | x_j = θ] = 0,
or the value of the probability statement 8 is unity.

Suppose, however, that at each level x_j a sample of k > 1
Y's is taken. Since the sample mean

\bar{y}(x_j) = \frac{1}{k} \sum_{i=1}^{k} Y_i(x_j),   where Y_i = 0 if no response occurs
                                                                = 1 if a response occurs

has the same expected value as the random variable Y(x), the
recursive formula x_{j+1} = x_j + c_j[\alpha - \bar{y}(x_j)] will converge with
probability one to the same limit as x_{j+1} = x_j + c_j[\alpha - y(x_j)]
for estimators I and II.

Let us consider a special application of the general
stochastic approximation technique, that is, the problem to which
stochastic approximations would be most applicable: the quantal response
problem or sensitivity testing.² This is a test in which the
experimenter wants to determine a level of x such that the
probability of a response as defined by the problem will be
some preassigned value, say α. Let M(x) be defined as above,
where f(x) is monotonically increasing in its range. Let us now
consider the upper and lower tolerance equations, L_1 and L_2
respectively, such that a fraction 1 - 2γ of the observed Y(x) will
be expected to fall between them. Let us represent these by

L_1 = f(x) + m \frac{f(x) - f^2(x)}{k}

and

L_2 = f(x) - m \frac{f(x) - f^2(x)}{k}

Differentiating,

\frac{dL_1}{dx} = f'(x) [1 + \frac{m}{k} - \frac{2m}{k} f(x)] > 0

when k, the sample size, and m are selected so that

\Pr[f(x) - m\sigma_{\bar{y}} < \bar{Y}(x) < f(x) + m\sigma_{\bar{y}}] = 1 - 2\gamma

and k is sufficiently large so that m/k < 1. Similarly,

\frac{dL_2}{dx} = f'(x) [1 - \frac{m}{k} + \frac{2m}{k} f(x)] > 0

That is, both tolerance equations are monotone and increasing
with x as long as m/k \le 1.

² A good example would be to determine that dosage of radiation
to which a specified laboratory animal can be subjected such
that the probability of its death after subjection to the dosage
would be less than 10 percent.

In order to gain further insight, consider Fig. 1. A desirable
quality of a test would be conditions such that the length
of the interval I(θ) = [x(L_1), x(L_2)] be minimized. The length
of I(θ) depends upon the slope and curvature of f(x) in the
neighborhood of θ and the distribution function of Y, say G(y | x).

Since k\bar{Y} is distributed as

\binom{k}{k\bar{y}} [M(x)]^{k\bar{y}} [1 - M(x)]^{k - k\bar{y}}

increasing the sample size k decreases the variance

Var(\bar{Y} | x) = \frac{M(x)[1 - M(x)]}{k}
We note that the length of I(θ) tends to zero as k increases
without bound and that the density g(y | x) becomes symmetric as
k increases. Hence, for large samples, we are assured that as
the trials proceed we will move toward θ with a probability of
at least 1 - γ at each trial when x ∉ I(θ). It is only in those
trials at levels of x which are contained in I(θ) that the
probability the next step will be toward θ is less than 1 - γ.

Figure 1 illustrates that each sample size fixes the tolerance
equations L_1(x) and L_2(x). Note that the probability of moving
toward θ at each x_j exceeds or is equal to 1 - γ if x ∉ I(θ).
Since cost and sample size are usually directly related, it
would be desirable to minimize k, the sample size. If |x_j - θ|
is relatively large, a small sample size seems to be desirable.
When |x_j - θ| is relatively small, a larger sample size causes
the length of I(θ) to decrease and the likelihood that x ∉ I(θ) to
increase.
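The effect of the sample size on the chance of stepping the wrong way can be illustrated with an exact binomial calculation. The sketch assumes an increasing M(x), a level with M(x) = m different from α, and counts a sample mean equal to α as not moving toward θ; the function name `prob_not_toward` is hypothetical.

```python
from math import comb


def prob_not_toward(m, alpha, k):
    """Probability that the mean of k Bernoulli(m) draws does not point toward theta.

    For increasing M, a level with m = M(x) > alpha lies to the right of
    theta, and the step moves left only if the sample mean exceeds alpha
    (and the reverse when m < alpha). Assumes m != alpha.
    """
    pmf = [comb(k, s) * m**s * (1 - m) ** (k - s) for s in range(k + 1)]
    if m > alpha:
        return sum(p for s, p in enumerate(pmf) if s / k <= alpha)
    return sum(p for s, p in enumerate(pmf) if s / k >= alpha)


if __name__ == "__main__":
    for k in (1, 5, 10, 20):
        print(k, prob_not_toward(0.5, 0.3, k))
```

At a fixed level away from θ the probability falls as k grows, which is the trade-off between cost and accuracy discussed in the text.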

The effect of increasing sample size with number of trials
has been studied empirically. (See Tables 4-8, pp. 22-26.)

FIG. 1. The Regression Function and the Associated Curve
Illustrating the Probability of Making an Incorrect Decision
in Direction.

THE MONTE CARLO SAMPLING PLAN TO STUDY THE RATES
OF CONVERGENCE OF THE ESTIMATORS

Due to the number of uncontrollable parameters involved,
perhaps the most practical means available at this time to study
convergence properties of the three estimators is a Monte Carlo
procedure. The procedure used is as follows:

1. Define M(x), α, Δk, and k, where k is the size of the
sample taken at each level of x, and Δk is an increment which
will be added to k with increasing trials.

2. Letting x_1 = α, compute M(x_1).

3. Generate k random numbers (r_i, i = 1, 2, ..., k) from
a uniform density.

4. Compare each random number r_i with M(x_j). If r_i > M(x_j),
assign the value of zero to Y_i. If r_i < M(x_j), assign the value
of unity to Y_i.

5. Compute \bar{y}_j = \frac{1}{k} \sum_{i=1}^{k} Y_i.

6. Substitute \bar{y}_j into the recursive formula to determine
x_{j+1}.

7. If (x_j - x_{j-1})(x_{j-1} - x_{j-2}) < 0, an increment of Δk is
added to the sample size.

This procedure was programmed for the IBM 704 and continued
for a desired number of trials. By repeating the process several
times, various conclusions can be made.
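The seven steps can be sketched in a modern language as follows. This is only an illustration for estimator I with c_j = c/j, not the original IBM 704 program; the function names, the default c = 0.25, and the response M(x) = x^4 in the usage lines are assumptions.

```python
import random


def response_probability(M, x):
    """M(x) extended by 0 below the unit interval and 1 above it."""
    if x < 0.0:
        return 0.0
    if x > 1.0:
        return 1.0
    return M(x)


def simulate_test(M, alpha, k, delta_k, trials, c=0.25, rng=None):
    """Return [x_1, ..., x_{trials+1}] for one simulated test."""
    rng = rng or random.Random()
    xs = [alpha]                                          # step 2: x_1 = alpha
    for j in range(1, trials + 1):
        p = response_probability(M, xs[-1])
        ys = [1 if rng.random() < p else 0 for _ in range(k)]   # steps 3-4
        y_bar = sum(ys) / k                               # step 5
        xs.append(xs[-1] + (c / j) * (alpha - y_bar))     # step 6
        # Step 7, checked with the newest level: a reversal of direction
        # adds delta_k to the sample size used at later levels.
        if len(xs) >= 3 and (xs[-1] - xs[-2]) * (xs[-2] - xs[-3]) < 0:
            k += delta_k
    return xs


def average_paths(M, alpha, k, delta_k, trials=49, repeats=100, seed=0):
    """Average x_j over independent repetitions of the simulated test."""
    rng = random.Random(seed)
    sums = [0.0] * (trials + 1)
    for _ in range(repeats):
        path = simulate_test(M, alpha, k, delta_k, trials, rng=rng)
        for i, x in enumerate(path):
            sums[i] += x
    return [s / repeats for s in sums]


if __name__ == "__main__":
    avg = average_paths(lambda x: x ** 4, alpha=0.5, k=5, delta_k=0)
    print(avg[7], avg[49])
```

Passing a seeded `random.Random` makes a run repeatable, which is convenient when comparing estimators on the same stream of random numbers.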

In the study, each test was composed of a simulation of
forty-nine trials. Each test was repeated one hundred times.
Average values for x_7, x_14, x_21, x_28, x_35, x_42, and x_49 were
tabulated (Tables 4-8) for various values of α, k, and Δk.

In practice, the form of M(x) is unknown to the experimenter,
but it was necessary to define the form of M(x) to perform the
sampling plan. Five forms of M(x) were selected in order for
a relatively complete grid to be placed over the unit square
(Fig. 2).
These were

M_1(x) = x^{1/4},               0 \le x \le 1

M_2(x) = 4x^2,                  0 \le x \le 1/4
       = 1 - 4(1 - x)^2/3,      1/4 \le x \le 1

M_3(x) = 2x^2,                  0 \le x \le 1/2
       = 1 - 2(1 - x)^2,        1/2 \le x \le 1

M_4(x) = 4x^2/3,                0 \le x \le 3/4
       = 1 - 4(1 - x)^2,        3/4 \le x \le 1

M_5(x) = x^4,                   0 \le x \le 1

The form of dH(y | x)/dy is defined by the quantal response
property as the point binomial.

The values of α considered here with their associated θ_i for
i = 1, 2, 3, 4, 5, where θ_i is the x value of the intersection
of M(x) = α and M_i(x), are tabulated in Table 3.


TABLE 3. Data for Sampling Procedure

 α       θ_1       θ_2       θ_3       θ_4       θ_5
.05    .00006    .11180    .15811    .19365    .47287
.10    .00010    .15811    .22361    .27386    .56234
.30    .00810    .27543    .38730    .47434    .74008
.50    .06250    .38763    .50000    .61237    .84090
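The roots θ_i can be recomputed by bisection for comparison with Table 3. The piecewise forms below are a reading of the five regression functions as continuous, increasing curves on the unit interval and should be treated as an assumption; the function names are illustrative.

```python
def M1(x):
    return x ** 0.25


def M2(x):
    return 4 * x * x if x <= 0.25 else 1 - 4 * (1 - x) ** 2 / 3


def M3(x):
    return 2 * x * x if x <= 0.5 else 1 - 2 * (1 - x) ** 2


def M4(x):
    return 4 * x * x / 3 if x <= 0.75 else 1 - 4 * (1 - x) ** 2


def M5(x):
    return x ** 4


def root(M, alpha, tol=1e-12):
    """Solve M(x) = alpha on [0, 1] by bisection, assuming M is increasing."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if M(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for alpha in (0.05, 0.10, 0.30, 0.50):
        row = [root(M, alpha) for M in (M1, M2, M3, M4, M5)]
        print(alpha, " ".join(f"{theta:.5f}" for theta in row))
```

Bisection needs only monotonicity, so the same helper works for all five forms without inverting each branch by hand.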
FIG. 3. The Effect of the Selection of {c/j} on the
Rate of Convergence.
(Vertical axis: average value of x_j for 400 independent tests.)

FIG. 4. The Variation of the Estimator for Various Values of
(Vertical axis: EMS x 10^2 for 400 independent tests.)
Various sample sizes, ranging from one to twenty, were used
in simulating the test. Also, a scheme in which the sample size
increases by an increment of five as the number of trials
increased was considered. When (x_j - x_{j-1})(x_{j-1} - x_{j-2}) < 0, the
sample size was increased.

The sequence {c_j} for the empirical study was {c/j}, where
c = 0.250 and j is the trial number. The choice of 0.250 is
arbitrary and is not optimum for all forms of M(x).

The selection of c = 0.250 was based on the data summarized
in Fig. 3 and 4. Three choices of c (c = 0.125, 0.250, 0.375)
were studied empirically using estimator I. From Fig. 3, a
"good" value of c in terms of minimum error in accuracy, in
a sequence of form {c/j}, would be in the range from 0.250
to 0.375. Figure 4 shows that the greater variability of the
estimator for a small sample size for c = 0.375 may offset its
value as an estimator even though it is associated with the
minimum bias of the three cases studied here.

The  results  of  the  Monte  Carlo  simulation  are  tabulated 
in  Tables  4-8. 
CONCLUSIONS

The most significant result of the empirical study is
perhaps the apparent slowness with which estimator I converges to
θ, especially when |x_j - θ| is relatively large. For a test
which involves less than fifty trials, estimator I when compared
with II and III appears the least desirable in terms of accuracy.
Figures 5 and 6 illustrate and emphasize the slowness of its
convergence. A good rule is that unless the experimenter is
certain that the initial value, x_1, is close to θ, he should
avoid using estimator I (the Robbins-Monro stochastic
approximation method).

On comparing estimators II and III, it is apparent that
there are cases in which II appears better in terms of average
accuracy than III, and vice versa. When α = 0.50, the data
from Table 7 indicate that III is slightly better for all sample
sizes. Also, it should be noted that increasing the sample size
had little effect in increasing the rate of convergence for all
the estimators, I, II, and III. This is not true for other
values of α. However, with sample size 10, estimator III gives
a close approximation such that |x_49 - θ_i| < 0.006 for all
θ_i for i = 1, 2, 3, 4, 5. If accuracy of the estimator is of
first importance when estimating θ for α = 0.50, the experimenter
can be assured that estimator III will on the average give
results with very good accuracy.
FIG. 5. Comparisons of the Rates of Convergence
(Horizontal axis: trial number j.)
On comparing estimators II and III for values of α other
than α = 0.50, it is clear that II is better for sample size
one, but III becomes better with increasing sample size. The
data indicate that, for small sample sizes (1 and 5) and
α = 0.05, III overestimates (see Fig. 3). In order to explain
this, consider the following rationale.

Recalling that for sample size one

$$Y = \begin{cases} 0 & \text{if no response occurs} \\ 1 & \text{if a response occurs} \end{cases}$$

then if $\alpha \in (y_j, y_{j-1})$, a linear interpolation restricts $x_{j+1}$ such that $x_j < x_{j+1} < x_{j-1}$ or $x_{j-1} < x_{j+1} < x_j$. Suppose that $\alpha = 0.05$; then one would expect in the neighborhood of $\theta$ that only one out of twenty trials would result in a response. Hence there would occur on the average twenty steps to the right for one to the left. But when the one does occur, $x_{j+1} \in (x_{j-1}, x_j)$ or $x_{j+1} \in (x_j, x_{j-1})$, which offsets the large step back to the right which occurs in using I and II. Hence, one can expect estimator III to overestimate toward the left in the limit for $\alpha < 0.50$ and sample size one. It is assumed that $\alpha$ is always less than or equal to 0.50. But when $\alpha = 0.50$, the linear interpolation is meaningful and apparently there is little or no bias (see Table 7).
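
The step asymmetry in this rationale can be checked with a little arithmetic. The sketch below is an illustration only, not part of the study. It assumes sample size one with Y in {0, 1}, a response probability equal to $\alpha$ near $\theta$, and a common step constant `c` (a hypothetical value of 1.0) for the steps being compared. It also assumes that every response is immediately preceded by a non-response, so that estimator III takes the interpolation slope $c_j = (x_j - x_{j-1})/(y_j - y_{j-1})$ at that trial.

```python
def step_lengths(alpha, c):
    """Step lengths near theta for sample size one with Y in {0, 1}."""
    right = c * (alpha - 0.0)            # no response: x_{j+1} - x_j = c * alpha
    left_fixed = c * (1.0 - alpha)       # response with a fixed constant c (as in I and II)
    # Estimator III, response right after a non-response (y_{j-1} = 0, y_j = 1):
    # c_j = (x_j - x_{j-1}) / (1 - 0) = right, so the step back is right * (1 - alpha).
    left_interp = right * (1.0 - alpha)
    return right, left_fixed, left_interp

alpha, c = 0.05, 1.0                      # c = 1.0 is an arbitrary illustrative constant
right, left_fixed, left_interp = step_lengths(alpha, c)

# Crude per-trial drift when Pr[response] = alpha (positive means drift to the right).
drift_fixed = (1.0 - alpha) * right - alpha * left_fixed
drift_interp = (1.0 - alpha) * right - alpha * left_interp

print(right, left_fixed, left_interp)
print(drift_fixed, drift_interp)
```

Under these assumptions the fixed-constant steps balance on the average, while the interpolated step back is shorter by a factor of $\alpha$, which leaves a net drift toward larger x. That is the direction of the overestimation reported for III at small sample sizes.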

As the sample size increases, the error in accuracy for estimator III becomes smaller, indicating that either the symmetry of the density $dH(y \mid x)/dy$ or the decrease in the size of the variance of $Y$ affects the convergence properties of III to $\theta$.

Consider the function $p_k(x)$, which defines the probability that the direction of the next step from $x$ will not be in the direction of $\theta$ (see lower part of Fig. 1).

That is

$$p_k(x) = \begin{cases} \cdots & x \neq \theta \\ \cdots & x = \theta \end{cases}$$

where

$$0 < s(x) < \max\left[ \int_{y > \alpha} dG(y \mid x),\ \int_{y < \alpha} dG(y \mid x) \right]$$

which in the limit as $k$ increases without bound becomes $p_\infty(x) = 0$. This is sufficient for the estimator $x_{j+1} = x_j + c_j(\alpha - Y_j)$ to converge in the limit to $\theta$ as $k$ tends to $\infty$ and $j$ tends to $\infty$.
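
The sketch below illustrates how the probability of a step in the wrong direction shrinks as the sample size grows. It is not the report's $p_k(x)$. It assumes that $k$ is the sample size at each trial, that $Y$ is the proportion of responses among $k$ independent binary trials, and that the response probability at the current level is a hypothetical $p = 0.3$ with $\alpha = 0.5$, so that $x$ lies below $\theta$ and a wrong-direction step occurs when the observed proportion exceeds $\alpha$.

```python
from math import comb

def wrong_direction_prob(p, alpha, k):
    """Pr[proportion of responses in k Bernoulli(p) trials > alpha].

    For p < alpha this is the chance that the next step moves away from theta.
    """
    return sum(comb(k, r) * p**r * (1 - p)**(k - r)
               for r in range(k + 1) if r > alpha * k)

sample_sizes = (1, 5, 25, 125)
probs = [wrong_direction_prob(0.3, 0.5, k) for k in sample_sizes]
for k, pr in zip(sample_sizes, probs):
    print(k, pr)
```

For this choice of values the probabilities decrease with $k$ and become very small, which matches the statement that the limiting probability is zero.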


The results support the following rules: For small sample sizes and a large number of trials, avoid using estimator III. For sample sizes larger than five and a small number of trials, estimator III gives greater accuracy.


The direct relationship between small error in accuracy and large sample sizes poses a problem of efficiency of estimators, that is, whether larger samples with a small number of trials are more desirable than unit sample sizes with a large number of trials. The answer depends on the nature of the test and must be worked out for the specific test; hence, it will not be considered here.


Increasing the sample size sequentially by increments of five (see Table 8) does not, in the cases studied, increase significantly the accuracy of the estimators, especially when the comparisons are based on sample sizes larger than one. This method can be used when sample sizes are not restrictive and relatively good accuracy is important. However, it was observed that sample sizes will in some cases exceed one hundred experimental units at the forty-ninth trial. The accuracy of estimator III is increased perhaps the most from such a scheme. It is important to note that increasing the sample size has little or no effect on the rate of convergence of estimators I and II.

The  main  results  of  this  study  are  that  estimator  I, 
although  of  historical  and  theoretical  importance,  appears 
impractical  for  purposes  of  application,  and  the  choice  of  using 
II  or  III  depends  upon  the  conditions  surrounding  the  tests  and 
must  be  determined  for  each  test. 
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Appendix 


PROPERTIES OF THE SEQUENCE $[c_j]$ ASSOCIATED WITH ESTIMATOR III


Let $H(y \mid x)$ be a family of distribution functions depending on the real parameter $x$, and let

(9)  $M(x) = \int_{-\infty}^{+\infty} y \, dH(y \mid x)$

be the corresponding regression function. It is assumed that $M(x)$ is unknown to the experimenter, who is, however, allowed to take observations on $H(y \mid x)$ for any value of $x$.


The recursive formula

(10)  $x_{j+1} = x_j + c_j(\alpha - y_j)$

defines a sequence $[x_j]$ which, it is desired, converges in the limit with probability one to $\theta$, which is a root of the equation

(11)  $M(x) = \alpha$
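
A minimal sketch of the recursion in Eq. 10 follows. The function name `run_recursion`, the step rule `1 / j`, and the noise-free response `M(x) = x / 2` are assumptions made only for illustration; in a sensitivity test the response would be a random observation taken at the level $x_j$.

```python
def run_recursion(respond, x1, alpha, step, n_trials):
    """Iterate x_{j+1} = x_j + c_j * (alpha - y_j) (Eq. 10) and return every x_j."""
    xs = [x1]
    for j in range(1, n_trials + 1):
        y = respond(xs[-1])                      # observation y_j taken at the level x_j
        xs.append(xs[-1] + step(j) * (alpha - y))
    return xs

# Noise-free check with a hypothetical regression function M(x) = x / 2,
# whose root for alpha = 0.3 is theta = 0.6.
xs = run_recursion(lambda x: 0.5 * x, x1=0.0, alpha=0.3,
                   step=lambda j: 1.0 / j, n_trials=10000)
print(xs[1], xs[2], xs[-1])
```

With this step rule the iterates approach the root slowly, which is typical of a fixed decreasing sequence of constants.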


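The value $c_j$ is an element of a sequence defined by the following rule:

(12)  $c_1 = a_1$

      $c_2 = a_2$

If $c_{j-1} = a_k$ for $k \geq 2$, then

$$c_j = \begin{cases} a_k & \text{when } \alpha \notin (y_j, y_{j-1}) \\ (x_j - x_{j-1})/(y_j - y_{j-1}) & \text{when } \alpha \in (y_j, y_{j-1}) \end{cases}$$

$$c_{j+1} = \begin{cases} a_k & \text{when } c_j = a_k \text{ and } \alpha \notin (y_{j+1}, y_j) \\ (x_{j+1} - x_j)/(y_{j+1} - y_j) & \text{when } \alpha \in (y_{j+1}, y_j) \\ a_{k+1} & \text{when } \alpha \notin (y_{j+1}, y_j) \text{ and } c_j = (x_j - x_{j-1})/(y_j - y_{j-1}) \end{cases}$$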


where $a_k$ is an element of a sequence $[a_k]$ having the following properties:

(a)  $a_k > 0$ for $k = 1, 2, 3, \ldots$

(b)  $a_k > a_{k+1}$ for $k = 1, 2, 3, \ldots$

(c)  $\sum_{1}^{\infty} a_j = \infty$

(d)  $\sum_{1}^{\infty} a_j^2 < \infty$
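
Rule 12 can be sketched in code as follows. This is one reading of the rule, not the program used in the study. The names `a` and `estimator_iii`, the choice $a_k = \text{scale}/k$, and the noise-free response used in the example are all assumptions. The interval $(y_j, y_{j-1})$ is taken to mean that $\alpha$ lies strictly between the two most recent observations, in either order.

```python
def a(k, scale=1.0):
    """Hypothetical choice a_k = scale / k; it satisfies properties (a)-(d)."""
    return scale / k

def estimator_iii(respond, x1, alpha, n_trials, scale=1.0):
    """Run Eq. 10 with c_j chosen by rule 12; return the lists [x_j] and [c_j]."""
    xs, ys, cs = [x1], [], []
    k = 0                    # index of the most recently used a_k
    prev_interp = False      # True when c_{j-1} was an interpolation slope
    for j in range(1, n_trials + 1):
        ys.append(respond(xs[-1]))
        # alpha strictly between y_j and y_{j-1} (only possible from the third trial on)
        brackets = j >= 3 and min(ys[-1], ys[-2]) < alpha < max(ys[-1], ys[-2])
        if brackets:
            c = (xs[-1] - xs[-2]) / (ys[-1] - ys[-2])   # linear interpolation slope
        else:
            if j <= 2:
                k = j                                    # c_1 = a_1, c_2 = a_2
            elif prev_interp:
                k += 1                                   # move on to a_{k+1}
            c = a(k, scale)
        prev_interp = brackets
        cs.append(c)
        xs.append(xs[-1] + c * (alpha - ys[-1]))         # Eq. 10
    return xs, cs

# Noise-free example with a hypothetical regression function M(x) = x**2.
xs, cs = estimator_iii(lambda x: x * x, x1=0.0, alpha=0.5, n_trials=30, scale=2.0)
print(cs[:4], xs[-1])
```

In this example the third constant is an interpolation slope and the fourth is $a_3$, so the sequence of constants is not the fixed sequence $[a_k]$ but a mixture of its elements and interpolation slopes.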

It is assumed that $M(x)$ is a continuous function and $H(y \mid x)$ is such that

$\Pr[Y > \alpha \mid x < \theta] < \Pr[Y > \alpha \mid x = \theta]$

and

$\Pr[Y > \alpha \mid x > \theta] > \Pr[Y > \alpha \mid x = \theta]$

These conditions and the restrictions listed below are the only restrictions placed on $M(x)$ and $H(y \mid x)$.

(a)  $|M(x)| \leq c + d|x|$, where $c$ and $d$ are real constants

(b)  $\int_{-\infty}^{\infty} |y - M(x)|^2 \, dH(y \mid x) \leq \sigma^2 < \infty$

(c)  $M(x) < \alpha$ for $x < \theta$, $M(x) > \alpha$ for $x > \theta$

(d)  $\inf_{\delta_1 \leq |x - \theta| \leq \delta_2} |M(x) - \alpha| > 0$ for every pair of numbers $(\delta_1, \delta_2)$ with $0 < \delta_1 < \delta_2 < \infty$

The properties of the sequence $[c_j]$ will be presented in the form of seven lemmas and a single theorem.


Lemma  1 


If the elements $c_{k-1}$ and $c_k$ of the sequence $[c_j]$ are such that $c_{k-1} \in [a_k]$ and $c_k = (x_k - x_{k-1})/(y_k - y_{k-1})$, then $0 < c_k < c_{k-1}$.

Proof. Since $c_{k-1} \in [a_k]$, $c_{k-1} > 0$. If $y_k < \alpha < y_{k-1}$, then $x_k < x_{k-1}$. Similarly, if $y_{k-1} < \alpha < y_k$, then $x_{k-1} < x_k$. It follows immediately that $c_k = (x_k - x_{k-1})/(y_k - y_{k-1}) > 0$.

It remains to be proved that $c_k < c_{k-1}$. Since $x_k = x_{k-1} + c_{k-1}(\alpha - y_{k-1})$, we can write $c_k = c_{k-1}(\alpha - y_{k-1})/(y_k - y_{k-1})$. Noting that both $y_k < \alpha < y_{k-1}$ and $y_{k-1} < \alpha < y_k$ imply that $0 < (\alpha - y_{k-1})/(y_k - y_{k-1}) < 1$, it can be concluded that $c_k < c_{k-1}$. It should be noted that if $x_k < x_{k-1}$ then $y_{k-1} < y_k$ cannot be true. This follows immediately from the recursive formula, Eq. 10.


Lemma  2 

For every $k$ such that $c_k = (x_k - x_{k-1})/(y_k - y_{k-1})$ and $c_{k+1} = (x_{k+1} - x_k)/(y_{k+1} - y_k)$, $c_{k+1} < c_k$.

Proof. From the proof of Lemma 1, we know that $c_{k+1} = c_k(\alpha - y_k)/(y_{k+1} - y_k)$. Since $c_k > 0$ and $0 < (\alpha - y_k)/(y_{k+1} - y_k) < 1$, it follows that $c_{k+1} < c_k$.

It should be noted that in general $c_{j+1}$ is not less than $c_j$ for all $j = 1, 2, \ldots$.
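
The identity used in the proofs of Lemmas 1 and 2 can be checked numerically. The values below are hypothetical and were chosen only so that $\alpha$ lies between the two observations; the helper name `next_slope` is likewise an assumption.

```python
def next_slope(c_prev, alpha, y_prev, y_next):
    """c_k = c_{k-1} * (alpha - y_{k-1}) / (y_k - y_{k-1}), from the proof of Lemma 1."""
    return c_prev * (alpha - y_prev) / (y_next - y_prev)

alpha, x_prev, c_prev = 0.5, 1.0, 1.0
y_prev, y_next = 0.9, 0.2                          # hypothetical observations bracketing alpha
x_next = x_prev + c_prev * (alpha - y_prev)        # Eq. 10
c_direct = (x_next - x_prev) / (y_next - y_prev)   # interpolation slope from its definition
c_factor = next_slope(c_prev, alpha, y_prev, y_next)
print(c_direct, c_factor)
```

Both routes give the same constant, and it is positive and smaller than the preceding one because the factor $(\alpha - y_{k-1})/(y_k - y_{k-1})$ lies strictly between zero and one.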

Lemma  3 

For each $k$ and $J$ the probability that $c_j = a_k$ for all $j \geq J$ is zero.

Proof. Let $c_j = a_k$ for all $j \geq J$, where $j = 1, 2, \ldots$. The sequence $[x_j]$ is then monotone; it converges to a finite value, say $A$, if the sequence is bounded, and diverges to either $-\infty$ or $+\infty$ if unbounded.

Let $[x_j]$ be non-increasing and bounded below by its limit $A$. Then for each $j > J$ there exists an $e_j > 0$ such that $x_j = A + e_j a_k$. The sequence $[e_j]$ is a non-increasing sequence of positive elements such that $\lim_{j \to \infty} e_j = 0$.

j-®  J 

Clearly then,

(13)  $0 \leq x_{j+1} - A \leq e_j a_k$

Simplifying,

$0 \leq x_j - A + a_k(\alpha - y_j) \leq e_j a_k$

$0 \leq e_j a_k + a_k(\alpha - y_j) \leq e_j a_k$

$0 \leq e_j + (\alpha - y_j) \leq e_j$

$\alpha \leq y_j \leq \alpha + e_j$


Let us now consider the probability of such an event, that is, $\Pr[\alpha \leq Y_j \leq \alpha + e_j]$. If $H(y \mid x)$ is continuous, then, as $j \to \infty$ and $e_j \to 0$, $\Pr[\alpha \leq Y_j \leq \alpha + e_j] \to 0$. However, if $H(y \mid x)$ is discrete, $\Pr[\alpha \leq Y_j \leq \alpha + e_j]$ may not necessarily converge to zero as $e_j \to 0$. But $[\alpha \leq Y_j \leq \alpha + e_j]$ must hold for all $j$ greater than that one for which the inequality 13 holds. Clearly, as $e_j$ tends to zero, the probability of such an event is

$$\prod_{1}^{\infty} \Pr[Y_j = \alpha] \leq \lim_{n \to \infty} \Big\{ \max_{A \leq x \leq x_J} \Pr[Y_j = \alpha] \Big\}^{n} = 0$$

A similar argument holds when the sequence $[x_j]$ is non-decreasing and bounded.

Suppose the sequence $[x_j]$ is unbounded; then either $\lim_{j \to \infty} x_j = \infty$ or $\lim_{j \to \infty} x_j = -\infty$. In order for these events to occur, $y_j < \alpha$ or $y_j > \alpha$ for all $j > J$, respectively. Let us investigate the probability of such events, that is, $\Pr[Y_j > \alpha, Y_{j+1} > \alpha, \ldots] = \Pr[\lim_{j \to \infty} x_j = -\infty]$ and $\Pr[Y_j < \alpha, Y_{j+1} < \alpha, \ldots] = \Pr[\lim_{j \to \infty} x_j = +\infty]$. Consider the latter of the two cases.


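$$\Pr[Y_j < \alpha, Y_{j+1} < \alpha, \ldots] = \Pr[Y_j < \alpha] \, \Pr[Y_{j+1} < \alpha \mid Y_j < \alpha] \cdots \Pr[Y_{j+L} < \alpha \mid Y_j < \alpha, \ldots, Y_{j+L-1} < \alpha] \cdots = \prod_{L} \Pr[Y_{j+L} < \alpha]$$

There exists only a finite number of $L$ such that $x_{j+L} < \theta$. It follows then that

$$\Pr[Y_j < \alpha, Y_{j+1} < \alpha, \ldots] \leq \prod_{j} \Pr[Y_j < \alpha \mid x_j > \theta] \leq \prod_{1}^{\infty} \Pr[Y < \alpha \mid x = \theta] = 0$$

A similar argument holds when $\lim_{j \to \infty} x_j = -\infty$, and the lemma is proved.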

Lemma  4 


If $c_j = (x_j - x_{j-1})/(y_j - y_{j-1})$ for all $j > J$, then $\lim_{j \to \infty} (x_j - x_{j-1}) = 0$ almost surely is true for all $c_j$.

Proof. Suppose $x_{2j} > x_{2j-1}$ and $\alpha < y_{2j}$. In order that $c_j$ have the form restricted by the hypothesis of the lemma, $y_{2j-1} < \alpha < y_{2j}$ for all $j \geq J$. The sequences $[x_{2j-1}]$ and $[x_{2j}]$ are monotone; the first is increasing, the second is decreasing. Since $x_{j+1}$ is obtained by a linear interpolation between $x_j$ and $x_{j-1}$, both sequences are bounded above and below. Let $\lim_{j \to \infty} x_{2j-1} = A$ and $\lim_{j \to \infty} x_{2j} = B$. Let $B - A = \Delta$, where $\Delta > 0$. Then for every $j > J$, there exists an $e_{2j-1} > 0$ such that $x_{2j-1} = A - e_{2j-1}$. The sequence $[e_{2j-1}]$ is monotonically decreasing and converges to zero. With each $j$ there exists an $e_{2j}$ such that $x_{2j} = B + e_{2j}$. The sequence $[e_{2j}]$ is monotonically decreasing and converges to zero as $j$ increases without bound. Consider

(14)  $x_{2j+1} = x_{2j} + [(x_{2j} - x_{2j-1})/(y_{2j} - y_{2j-1})](\alpha - y_{2j})$

      $= B + e_{2j} + (B + e_{2j} - A + e_{2j-1})[(\alpha - y_{2j})/(y_{2j} - y_{2j-1})]$

      $= B + \Delta[(\alpha - y_{2j})/(y_{2j} - y_{2j-1})] + e_{2j} + (e_{2j} + e_{2j-1})(\alpha - y_{2j})/(y_{2j} - y_{2j-1})$

Taking the limit of both sides,

$$\lim_{j \to \infty} x_{2j+1} = B - \Delta \Big[ \lim_{j \to \infty} (y_{2j} - \alpha)/(y_{2j} - y_{2j-1}) \Big]$$

it is clear that

$\Pr[\lim_{j \to \infty} (Y_{2j} - \alpha)/(Y_{2j} - Y_{2j-1}) = D] = 0$ for any $D$.

Since the left side of Eq. 14 converges and $\lim_{j \to \infty} (Y_{2j} - \alpha)/(Y_{2j} - Y_{2j-1})$ almost surely does not exist, $\Delta = 0$, that is, $A = B$. It follows immediately, then, that for almost all $c_j$, $\lim_{j \to \infty} (x_j - x_{j-1}) = 0$, the desired result.
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Lemma  5 

Let the sequence $[z_k]$ be the union of all subsequences of $[Z_k]$ such that $\lim_{k \to \infty} z_k = \infty$, where $Z_k$ is the number of times that the $k$th element of $[a_k]$ appears in the sequence $[c_j]$. Then, $\Pr[\lim_{k \to \infty} z_k = \infty] = 0$.

Proof. From Lemma 3, we know that for each $k$, $Z_k$ is almost always finite. Since the sum of a denumerable number of sets of measure zero is also of measure zero, we can conclude that the probability of at least one element of the sequence of infinite terms in $[Z_k]$ being infinite is also zero. This still does not assure us that the sequence $[z_k]$ is almost always bounded.

Let $\lim_{k \to \infty} z_k = \infty$. Then for each $L > 0$, there must exist a $k$ such that $z_k > L$. Consider the probability of such an event, that is,

$\Pr[z_k > L] = \Pr[Y_1 > \alpha, \ldots, Y_L > \alpha]$

or $\Pr[z_k > L] = \Pr[Y_1 < \alpha, \ldots, Y_L < \alpha]$

But, from the proof of Lemma 3, we know

$\lim_{L \to \infty} \Pr[Y_1 > \alpha, \ldots, Y_L > \alpha] = 0$

or $\lim_{L \to \infty} \Pr[Y_1 < \alpha, \ldots, Y_L < \alpha] = 0$

Hence, we can conclude

$\Pr[\lim_{k \to \infty} z_k = \infty] = 0$

That is, the sequence $[z_k]$ is almost surely a bounded sequence.
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Lemma  6 

For every $J$, the probability that $c_j = (x_j - x_{j-1})/(y_j - y_{j-1})$ for all $j > J$ is zero.


Proof. Suppose each element of the sequence $[c_j]$ takes on the form defined by the hypothesis of the lemma. Then the sequences $[x_{2j}]$ and $[x_{2j-1}]$ are monotonically decreasing and increasing sequences, respectively, when $y_{2j-1} < \alpha < y_{2j}$ for all $j > J$. Similarly, the sequences are monotonically increasing and decreasing, respectively, if $y_{2j} < \alpha < y_{2j-1}$ for all $j > J$.

By Lemma 4, we know that both these sequences converge to a common limit, $A$. Consider a neighborhood of $A$, say $v(A)$, such that at least one of the following probabilities is less than unity for all $x \in v(A)$: $\Pr[Y_{2j} > \alpha \mid x \in v(A)]$ and $\Pr[Y_{2j-1} < \alpha \mid x \in v(A)]$. The existence of $v(A)$ is assured by the continuity of $M(x)$. Suppose that at least one of the probabilities above is identically equal to unity, or at least in the limit equal to unity as $j \to \infty$ and $x \to A$. It is assumed that the variance of the random variable $Y$ is finite for all values of $x$ and that $M(x)$ is continuous. Then if

$$\lim_{x \to A,\ j \to \infty} \Pr[Y_{2j} > \alpha \mid x_{2j} \in v(A)] = 1$$

this must imply

$$\lim_{x \to A,\ j \to \infty} \Pr[Y_{2j-1} < \alpha \mid x_{2j-1} \in v(A)] = 0$$

and vice versa.

Let there be a $J$ such that $c_j = (x_j - x_{j-1})/(y_j - y_{j-1})$ for all $j > J$. Consider the probability of such an event, that is,


NAVWEPS REPORT 7837


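$$\Pr[Y_1 < \alpha, \ldots, Y_{2j-1} < \alpha, \ldots] \, \Pr[Y_2 > \alpha, \ldots, Y_{2j} > \alpha, \ldots] \leq \lim_{j \to \infty} \Big( \max_{x_j \in v} \Pr[Y_{2j-1} < \alpha] \Big)^{j} \, \lim_{j \to \infty} \Big( \max_{x_j \in v} \Pr[Y_{2j} > \alpha] \Big)^{j}$$

This is true since in $v(A)$ either $\max \Pr[Y_{2j-1} < \alpha]$ or $\max \Pr[Y_{2j} > \alpha]$ must be less than unity. Therefore, at least one of the limits will be identically zero.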

Lemma  7 

Let the sequence $[z_k]$ be the union of all subsequences of $[Z_k]$ such that $\lim_{k \to \infty} z_k = \infty$, where $Z_k$ is the number of elements of the sequence $[c_j]$ having the form $(x_j - x_{j-1})/(y_j - y_{j-1})$ which lie between any two successive members of the sequence $[a_k]$. Then $\Pr[\lim_{k \to \infty} z_k = \infty] = 0$.

Proof. Let $\lim_{j \to \infty} z_j = \infty$; then for each $2L > 0$ there exists a $j$ such that $z_j > 2L$. Let us now consider the probability of such an event, that is, $\Pr[Y_1 < \alpha, Y_2 > \alpha, \ldots, Y_{2k-1} < \alpha, Y_{2k} > \alpha, \ldots, Y_{2L-1} < \alpha, Y_{2L} > \alpha]$. But, from the proof of Lemma 3, we know that $\lim_{L \to \infty} \Pr[Y_1 < \alpha, \ldots, Y_{2L-1} < \alpha] \, \Pr[Y_2 > \alpha, \ldots, Y_{2L} > \alpha] = 0$. It follows then that $\Pr[\lim_{j \to \infty} z_j = \infty] = 0$.

Theorem  1 

Any given sequence $[c_j]$ is almost surely a member of the class of sequences $[b_j]$ where $[b_j]$ is defined by the following properties:

(a)  $b_j > 0$ for all $j$

(b)  $\sum_{1}^{\infty} b_j = \infty$

(c)  $\sum_{1}^{\infty} b_j^2 < \infty$

Proof. Consider any sequence $[c_j]$ as defined in rule 12. By Lemma 1, each element of the sequence is necessarily positive. Condition (a) is satisfied.

Lemma 3, Lemma 6, and Lemma 7 assure us that every element of the sequence $[a_k]$ is almost surely contained in $[c_j]$. Therefore, since $c_j > 0$ for all $j$,

$\sum_{1}^{\infty} c_j \geq \sum_{1}^{\infty} a_j$, but $\sum_{1}^{\infty} a_j = \infty$, then $\sum_{1}^{\infty} c_j = \infty$

Hence, condition (b) is satisfied.

In order to show that the sequence $[c_j]$ satisfies condition (c), consider the following infinite sum:

$$\sum_{1}^{\infty} c_j^2 = a_1^2 + a_2^2 + \cdots + a_2^2 + c_{11}^2 + c_{12}^2 + \cdots + c_{1M_1}^2 + a_3^2 + \cdots + a_3^2 + c_{21}^2 + \cdots + c_{2M_2}^2 + \text{etc.}$$

where $a_1$ occurs once, $a_2$ occurs $k_2$ times, $a_3$ occurs $k_3$ times, etc. By Lemma 5, the sequence $[k_j]$ is almost surely bounded. By Lemma 7, the sequence $[M_j]$ is almost surely bounded. Let $k = \max_j k_j$ and $M = \max_j M_j$.

If the sum $\sum_{1}^{\infty} c_j^2$ is convergent, it is absolutely convergent. The rearrangement of terms will not affect the convergence or the sum. Hence,

$$\sum_{1}^{\infty} c_j^2 \leq k \sum_{1}^{\infty} a_j^2 + M \sum_{1}^{\infty} a_j^2 = (k + M) \sum_{1}^{\infty} a_j^2 < \infty$$

which is the desired result, condition (c).
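
Conditions (b) and (c) of the theorem can be illustrated with a standard choice of constants. The sketch below assumes $b_j = \text{scale}/j$, which is not prescribed by the report; its partial sums grow without bound while the partial sums of its squares stay below $\pi^2/6$ when the scale is one.

```python
from math import log, pi

def partial_sums(n, scale=1.0):
    """Partial sums of b_j = scale / j and of b_j**2 for j = 1..n."""
    s1 = s2 = 0.0
    for j in range(1, n + 1):
        b = scale / j
        s1 += b
        s2 += b * b
    return s1, s2

for n in (10, 1000, 100000):
    s1, s2 = partial_sums(n)
    print(n, s1, s2, log(n), pi ** 2 / 6)
```

The first sum keeps pace with $\log n$ and so diverges, and the second is bounded, which is the behavior required of the constants in the theorem.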

What is unusual about the theorem is that the conditions

(a)  $c_j > 0$

(b)  $\sum_{1}^{\infty} c_j^2 < \infty$

(c)  $\sum_{1}^{\infty} c_j = \infty$

are identical to those required by Blum (Ref. 4) in his theorem which proves that the limit point of the sequence $[x_j]$ is $\theta$ with probability one for estimator I. The theorem can be stated as follows: Let $M(x)$ be the regression function corresponding to the family $H(y \mid x)$. Assume that $M(x)$ is a Lebesgue-measurable function satisfying

(a)  $|M(x)| \leq c + d|x|$

(b)  $\int_{-\infty}^{\infty} |y - M(x)|^2 \, dH(y \mid x) \leq \sigma^2 < \infty$

(c)  $M(x) < \alpha$ for $x < \theta$, $M(x) > \alpha$ for $x > \theta$

(d)  $\inf_{\delta_1 \leq |x - \theta| \leq \delta_2} |M(x) - \alpha| > 0$ for every pair of numbers $(\delta_1, \delta_2)$ with $0 < \delta_1 < \delta_2 < \infty$

Let $[b_j]$ be a sequence of positive numbers such that

(e)  $\sum_{1}^{\infty} b_j = \infty$

(f)  $\sum_{1}^{\infty} b_j^2 < \infty$

Let $x_1$ be an arbitrary number. Define a sequence of random variables recursively by

(g)  $x_{j+1} = x_j + b_j(\alpha - y_j)$

where $y_j$ is a random variable distributed according to $H(y \mid x_j)$. Then $x_j$ converges to $\theta$ with probability one.
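
A small Monte Carlo sketch of the recursion in (g) follows. It is an illustration under stated assumptions, not a reproduction of the study: the logistic response curve, the constants $b_j = \text{scale}/j$ with a scale of 4, the starting value, the seed, and the numbers of trials and runs are all hypothetical choices. For this curve the root of $M(x) = 0.5$ is $\theta = 0$.

```python
import math
import random

def simulate(alpha, n_trials, x1, scale, rng):
    """One run of x_{j+1} = x_j + b_j * (alpha - y_j) with b_j = scale / j and binary y_j."""
    x = x1
    for j in range(1, n_trials + 1):
        p = 1.0 / (1.0 + math.exp(-x))        # hypothetical response curve M(x)
        y = 1.0 if rng.random() < p else 0.0  # sample size one: response or no response
        x += (scale / j) * (alpha - y)
    return x

rng = random.Random(7837)                     # fixed seed so the runs are repeatable
theta = 0.0                                   # root of M(x) = 0.5 for this curve
estimates = [simulate(0.5, 5000, x1=3.0, scale=4.0, rng=rng) for _ in range(20)]
mean_abs_error = sum(abs(e - theta) for e in estimates) / len(estimates)
print(mean_abs_error)
```

The assumed curve is bounded, has bounded variance, and crosses $\alpha$ only at $\theta$, so it meets conditions (a) through (d), and the chosen constants meet (e) and (f). The runs therefore settle near $\theta$, as the theorem leads one to expect.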
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